Multi-label Learning with Semantic Embed- Dings

نویسندگان

  • Liping Jing
  • MiaoMiao Cheng
  • Michael W. Mahoney
چکیده

Multi-label learning aims to automatically assign to an instance (e.g., an image or a document) the most relevant subset of labels from a large set of possible labels. The main challenge is to maintain accurate predictions while scaling efficiently on data sets with extremely large label sets and many training data points. We propose a simple but effective neural net approach, the Semantic Embedding Model (SEM), that models the labels for an instance as draws from a multinomial distribution parametrized by nonlinear functions of the instance features. A Gauss-Siedel mini-batch adaptive gradient descent algorithm is used to fit the model. To handle extremely large label sets, we propose and experimentally validate the efficacy of fitting randomly chosen marginal label distributions. Experimental results on eight real-world data sets show that SEM garners significant performance gains over existing methods. In particular, we compare SEM to four recent state-ofthe-art algorithms (NNML, BMLPL, REmbed, and SLEEC) and find that SEM uniformly outperforms these algorithms in several widely used evaluation metrics while requiring significantly less training time.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving Multi-label Learning with Missing Labels by Structured Semantic Correlations

Multi-label learning has attracted significant interests in computer vision recently, finding applications in many vision tasks such as multiple object recognition and automatic image annotation. Associating multiple labels to a complex image is very difficult, not only due to the intricacy of describing the image, but also because of the incompleteness nature of the observed labels. Existing w...

متن کامل

Transductive Multi-label Zero-shot Learning

Zero-shot learning has received increasing interest as a means to alleviate the often prohibitive expense of annotating training data for large scale recognition problems. These methods have achieved great success via learning intermediate semantic representations in the form of attributes and more recently, semantic word vectors. However, they have thus far been constrained to the single-label...

متن کامل

Learning Distributed Document Representations for Multi-Label Document Categorization

Multi-label Document Categorization, the task of automatically assigning a text document into one or more categories has various real-world applications such as categorizing news articles, tagging Web pages, maintaining medical patient records and organizing digital libraries among many others. Statistical Machine Learning approaches to document categorization have focused on multi-label learni...

متن کامل

Multi-modal Multi-label Semantic Indexing of Images Based on Hybrid Ensemble Learning

Automatic image annotation (AIA) refers to the association of words to whole images which is considered as a promising and effective approach to bridge the semantic gap between low-level visual features and high-level semantic concepts. In this paper, we formulate the task of image annotation as a multi-label multi class semantic image classification problem and propose a simple yet effective m...

متن کامل

MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection

Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016